A Deep Semantic Alignment Network for the Cross-Modal Image-Text Retrieval in Remote Sensing

Abstract

Because of the rapid growth of multimodal data from the internet and social media, cross-modal retrieval has become an important and valuable task in recent years. The purpose of cross-modal retrieval is to obtain result data in one modality (e.g., image) that is semantically similar to query data in another modality (e.g., text). In the field of remote sensing, despite a great number of existing works on image retrieval, there has only been a small amount of research on cross-modal image-text retrieval, due to the scarcity of datasets and the complicated characteristics of remote sensing image data. In this article, we introduce a novel cross-modal image-text retrieval network to establish a direct relationship between remote sensing images and their paired text data. Specifically, in our framework, we designed a semantic alignment module to fully explore the latent correspondence between images and text, in which we used attention and gate mechanisms to filter and optimize data features so that more discriminative feature representations can be obtained. Experimental results on four benchmark remote sensing datasets, including UCMerced-LandUse-Captions, Sydney-Captions, RSICD, and NWPU-RESISC45-Captions, showed that the proposed method outperformed other baselines and achieved state-of-the-art performance in remote sensing image-text retrieval tasks.
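The retrieval setting described in the abstract can be illustrated with a toy sketch. This is not the paper's semantic alignment module: the element-wise sigmoid gate, the cosine-similarity ranking, the function names (`gate`, `rank_images`), and the three-dimensional embeddings are all assumptions chosen only to show how a text query might filter image features and then rank images.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(feature, context):
    """Hypothetical gate: scale each feature dimension by sigmoid(feature * context).

    Dimensions that agree with the context vector are kept more strongly
    than dimensions that do not. This is an illustrative assumption,
    not the gate used in the paper.
    """
    return [f * sigmoid(f * c) for f, c in zip(feature, context)]


def rank_images(text_vec, image_vecs):
    """Return image names sorted from most to least similar to the text query."""
    scores = {
        name: cosine(gate(vec, text_vec), text_vec)
        for name, vec in image_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (made up for illustration only).
image_vecs = {
    "airport": [0.9, 0.1, 0.0],
    "beach": [0.1, 0.9, 0.2],
    "forest": [0.0, 0.2, 0.9],
}
text_query = [0.8, 0.2, 0.1]

ranking = rank_images(text_query, image_vecs)
print(ranking)
```

In a real system the embeddings would come from trained image and text encoders, and the gate parameters would be learned rather than fixed.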

Similar resources

Learning Deep Semantic Embeddings for Cross-Modal Retrieval

Deep learning methods have been actively researched for cross-modal retrieval, with the softmax cross-entropy loss commonly applied for supervised learning. However, the softmax cross-entropy loss is known to result in large intra-class variances, which is not well suited for cross-modal matching. In this paper, a deep architecture called Deep Semantic Embedding (DSE) is proposed, which is ...

Cross-modal Retrieval by Text and Image Feature Biclustering

We describe our approach to the ImageCLEF-Photo 2007 task. The novelty of our method consists of biclustering image segments and annotation words. Given the query words, we may select the image segment clusters that have the strongest co-occurrence with the corresponding word clusters. These image segment clusters act as the selected segments relevant to a query. We rank text hits by our own tf.idf ...

A Radon-based Convolutional Neural Network for Medical Image Retrieval

Image classification and retrieval systems have gained more attention because of easier access to high-tech medical imaging. However, the lack of large-scale, balanced, labelled data in medicine is still a challenge. Simplicity, practicality, efficiency, and effectiveness are the main targets in the medical domain. To achieve these goals, Radon transformation, which is a well-known t...

Cross-modal domain adaptation for text-based regularization of image semantics in image retrieval systems

In query-by-semantic-example image retrieval, images are ranked by similarity of semantic descriptors. These descriptors are obtained by classifying each image with respect to a pre-defined vocabulary of semantic concepts. In this work, we consider the problem of improving the accuracy of semantic descriptors through cross-modal regularization, based on auxiliary text. A cross-modal regularizer...

Semiautomatic Image Retrieval Using the High Level Semantic Labels

Content-based image retrieval and text-based image retrieval are two fundamental approaches in the field of image retrieval. The challenges of each approach have led researchers to combine them and to use semi-automatic retrieval, with user interaction in the retrieval cycle. Hence, in this paper, an image retrieval system is introduced that provides two kinds of qu...

Journal

Journal title: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

Year: 2021

ISSN: 2151-1535, 1939-1404

DOI: https://doi.org/10.1109/jstars.2021.3070872